feat: subscribe to multiple audio tracks and add RFC 6464 audio level extension by martinadamsdev · Pull Request #68 · livekit/client-sdk-esp32

martinadamsdev · 2026-02-07T15:55:52Z

Summary

- Fix multi-device audio: Remove single audio track subscription limit in subscribe_tracks(). Devices in the same room can now hear each other (previously only the first audio track was subscribed, causing device-to-device audio to fail while web clients worked fine).
- Add RFC 6464 audio level extension: Inject ssrc-audio-level extmap into publisher SDP and RTP packets via esp_peer's RTP transformer API, enabling LiveKit Active Speaker detection.
- Upgrade esp_peer dependency: ~1.2.7 → >=1.3.0 (required for esp_peer_set_rtp_transformer API).

Changes

| File | What changed |
| --- | --- |
| `engine.c` | `session_state_t` now tracks up to 4 subscribed audio track SIDs with deduplication; `subscribe_tracks()` no longer breaks after first track; `handle_join()` no longer exits loop on `ENGINE_ERR_MAX_SUB` |
| `peer.c` | New `sdp_inject_audio_level_extmap()` for SDP patching; new RTP transformer callbacks for injecting RFC 5285/6464 audio level header extension; publisher offer SDP is patched before sending |
| `idf_component.yml` | `esp_peer` version bumped to `>=1.3.0` |

Test plan

- Two ESP32 devices (device-a, device-b) + web client join the same room
- Serial logs show multiple "Subscribing to audio track" lines per device
- Device-a can hear device-b and vice versa
- Web client can still hear both devices and be heard by both
- Serial logs show "Audio level RTP transformer enabled" on both devices

Summary by CodeRabbit

New Features
- Support for subscribing to multiple remote audio tracks per session with duplicate suppression and a configurable per-session limit
- Publisher-side audio-level RTP header extension injection (RFC 6464) with SDP negotiation to enable per-stream audio level reporting
Chores
- Updated dependency constraint for improved compatibility

CLAassistant · 2026-02-07T15:55:59Z

All committers have signed the CLA.

coderabbitai · 2026-02-07T15:56:21Z

Warning

Rate limit exceeded

@martinadamsdev has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 10 minutes and 26 seconds before requesting another review.


📥 Commits

Reviewing files that changed from the base of the PR and between 402bab7 and 7809e0c.

📒 Files selected for processing (2)

components/livekit/core/peer.c
components/livekit/idf_component.yml

📝 Walkthrough

Walkthrough

Adds fixed-size multi-subscription support for audio tracks, implements RFC 6464 audio level RTP header extension injection and transformation for publishers, and updates the espressif/esp_peer dependency constraint.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Audio Track Subscription Support `components/livekit/core/engine.c` | Replaced single `sub_audio_track_sid` with `sub_audio_track_sids[MAX_AUDIO_SUBSCRIPTIONS]` and `sub_audio_track_count`; added `MAX_AUDIO_SUBSCRIPTIONS` (4); `subscribe_tracks` deduplicates, enforces limit, and stores multiple subscriptions; removed prior early-return/break behavior. |
| RFC 6464 Audio Level Extension `components/livekit/core/peer.c` | Added SDP extmap injection for audio media lines, constants for extmap/IDs, and static SDP injector; added RTP transformer callbacks (`get_encoded_size`, `transform`) that target Opus (PT 111) and append an 8-byte audio-level extension; registers transformer for publisher peers and logs injection outcomes. |
| Component Dependency Update `components/livekit/idf_component.yml` | Changed `espressif/esp_peer` constraint from `~1.2.7` to `>=1.3.0`. |

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant App as Publisher App
    participant Peer as Peer (SDP generator)
    participant Transformer as RTP Transformer
    participant Remote as Remote Subscriber
    App->>Peer: Request offer SDP (publisher)
    Peer->>Peer: Inject RFC6464 extmap into audio m= line (if found)
    Peer-->>App: Return patched SDP (or original if injection fails)
    App->>Transformer: Send outbound RTP packets
    Transformer->>Transformer: check PT==111 (Opus) and packet structure
    Transformer->>Transformer: append 8-byte audio-level extension, set X-bit
    Transformer-->>Remote: Transmit modified RTP packets
```
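
The `PT==111` check in the diagram assumes Opus is always mapped to payload type 111. A minimal sketch of reading the negotiated value from the SDP instead, assuming the mapping appears in an `a=rtpmap` line. The names `sdp_find_opus_payload_type` and `starts_with_nocase` are hypothetical and are not part of esp_peer or this SDK:

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Case-insensitive prefix test (SDP encoding names are case-insensitive). */
static bool starts_with_nocase(const char *s, const char *prefix)
{
    while (*prefix != '\0') {
        if (tolower((unsigned char)*s) != tolower((unsigned char)*prefix)) {
            return false;
        }
        s++;
        prefix++;
    }
    return true;
}

/*
 * Hypothetical helper: walk the SDP line by line and return the payload type
 * of the first "a=rtpmap:<pt> opus/..." entry, or -1 if none is found.
 */
static int sdp_find_opus_payload_type(const char *sdp)
{
    static const char prefix[] = "a=rtpmap:";
    const size_t prefix_len = sizeof(prefix) - 1;

    const char *line = sdp;
    while (line != NULL && *line != '\0') {
        if (strncmp(line, prefix, prefix_len) == 0) {
            char *rest = NULL;
            long pt = strtol(line + prefix_len, &rest, 10);
            if (rest != line + prefix_len && *rest == ' ' &&
                pt >= 0 && pt <= 127 &&
                starts_with_nocase(rest + 1, "opus/")) {
                return (int)pt;
            }
        }
        const char *eol = strchr(line, '\n');
        line = (eol != NULL) ? eol + 1 : NULL;
    }
    return -1;
}
```

The result could be stored in the transformer context and compared against the payload type of each outgoing frame, falling back to skipping the extension when the lookup returns -1.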

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I twitched my nose and counted four,

Subscriptions hopping, not one more,
I stitched a level in each beat,
Opus feet and rhythmic treat,
A rabbit’s cheer for audio lore!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 42.86% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly describes the two main features implemented: multiple audio track subscriptions and RFC 6464 audio level extension support, matching the core changes across engine.c and peer.c. |


coderabbitai

Actionable comments posted: 2

🤖 Fix all issues with AI agents

In `@components/livekit/core/peer.c`:
- Around line 40-44: The code currently injects a hardcoded RTP extmap ID and a
fixed audio level: update the injection logic that uses AUDIO_LEVEL_EXTMAP_LINE
and AUDIO_LEVEL_EXT_ID to first parse the SDP for existing "a=extmap:<id>"
entries and choose an unused numeric ID (or reuse the referred URI) instead of
always using 1 to avoid duplicate extmap IDs; also update the
AUDIO_LEVEL_DEFAULT constant comment and/or its use to clearly mark it as a
placeholder (e.g., "placeholder -30 dBov for testing") and, where RTP audio
level is generated, replace the fixed value with real VAD/level measurement or
call a helper that returns a measured level so active speaker detection reflects
actual speech activity.
- Around line 106-129: The function audio_level_get_encoded_size currently
hardcodes payload type 111 which breaks when Opus is negotiated to a different
dynamic PT; update the transformer to use a negotiated payload type value pulled
from configuration/state instead of 111 — e.g., add a payload_type field to the
transformer context (ctx) or to the peer state and populate it when parsing the
SDP/answer, then replace the literal 111 check in audio_level_get_encoded_size
with that context/state value (use esp_peer_rtp_frame_t and RTP_EXT_BLOCK_SIZE
as before) so the transformer works for any negotiated Opus payload type.

🧹 Nitpick comments (1)

components/livekit/idf_component.yml (1)

15-15: Unbounded version constraint risks pulling in future breaking changes.

Every other dependency in this file uses a ~ (compatible) constraint, but esp_peer now uses >=1.3.0 with no upper bound. A future major version bump (e.g., 2.0) could introduce breaking API changes. Consider using ~1.3.0 (or >=1.3.0, <2.0.0) to stay consistent and safe.
-  espressif/esp_peer: ">=1.3.0"
+  espressif/esp_peer: "~1.3.0"

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0b70346 and cccb4d7.

📒 Files selected for processing (3)

components/livekit/core/engine.c
components/livekit/core/peer.c
components/livekit/idf_component.yml

🧰 Additional context used

📓 Path-based instructions (1)

components/**

📄 CodeRabbit inference engine (AGENTS.md)

Place LiveKit ESP32 SDK sources and supporting components in the components/ directory

Files:

components/livekit/core/engine.c
components/livekit/idf_component.yml
components/livekit/core/peer.c

🧠 Learnings (1)

📚 Learning: 2026-01-31T02:18:39.165Z

Learnt from: CR
Repo: livekit/client-sdk-esp32 PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-31T02:18:39.165Z
Learning: Applies to components/** : Place LiveKit ESP32 SDK sources and supporting components in the `components/` directory

Applied to files:

components/livekit/idf_component.yml

🧬 Code graph analysis (1)

components/livekit/core/engine.c (1)

components/livekit/core/signaling.c (1)

signal_send_update_subscription (420-436)

🔇 Additional comments (7)

components/livekit/core/engine.c (3)

91-98: LGTM!

The fixed-size array with a count field is a clean approach for this constrained environment, and memset in cleanup_previous_connection properly resets the state on reconnect.

163-183: LGTM!

The deduplication logic and bounded array management look correct. The strncmp and strlcpy both properly use sizeof(livekit_pb_sid_t) as the limit.
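
For readers without the diff at hand, the pattern being reviewed here can be sketched roughly as below. The type and function names (`track_sid_t`, `audio_subs_t`, `audio_subs_add`) and the 32-byte SID buffer are assumptions made for illustration; the PR itself uses `livekit_pb_sid_t` and `strlcpy`, which this sketch replaces with a bounded `strncpy` plus explicit termination so that it builds with plain ISO C.

```c
#include <stddef.h>
#include <string.h>

#define MAX_AUDIO_SUBSCRIPTIONS 4
#define SID_MAX_LEN 32  /* assumed size; the real livekit_pb_sid_t may differ */

typedef char track_sid_t[SID_MAX_LEN];

typedef struct {
    track_sid_t sids[MAX_AUDIO_SUBSCRIPTIONS];
    size_t count;
} audio_subs_t;

typedef enum { SUB_ADDED, SUB_DUPLICATE, SUB_FULL } sub_result_t;

/* Record a track SID unless it is already present or the table is full. */
static sub_result_t audio_subs_add(audio_subs_t *subs, const char *sid)
{
    /* Dedup first, so a repeated SID never counts against the limit. */
    for (size_t i = 0; i < subs->count; i++) {
        if (strncmp(subs->sids[i], sid, SID_MAX_LEN) == 0) {
            return SUB_DUPLICATE;
        }
    }
    if (subs->count >= MAX_AUDIO_SUBSCRIPTIONS) {
        return SUB_FULL;
    }
    /* Bounded copy with guaranteed NUL termination. */
    strncpy(subs->sids[subs->count], sid, SID_MAX_LEN - 1);
    subs->sids[subs->count][SID_MAX_LEN - 1] = '\0';
    subs->count++;
    return SUB_ADDED;
}
```

Zero-initializing the struct (for example with `memset`) resets both the array and the count, which mirrors the reconnect cleanup described in the review.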

710-716: LGTM!

Removing the early exit ensures all participants are visited. Once max subscriptions are reached, subsequent subscribe_tracks calls gracefully return after the limit check with minimal overhead.

components/livekit/core/peer.c (4)

51-104: LGTM!

The SDP injection logic correctly scopes its search to the audio media section, handles the boundary properly, and manages memory allocation with correct sizes. The fallback to NULL when no direction attribute is found is appropriate.

131-176: LGTM!

The RFC 5285 one-byte header extension block is correctly constructed: 0xBEDE profile, 1-word length, proper element header with ID and length nibbles, audio level data with voice-activity bit, and padding to word boundary. The payload copy arithmetic is correct.
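
The 8-byte block described above can be illustrated with a standalone sketch. It deliberately does not use the esp_peer transformer API; `rtp_insert_audio_level` is a hypothetical helper that copies a plain RTP packet into a caller-provided buffer with the extension inserted after the fixed header and any CSRC entries. The level is the RFC 6464 value, expressed as 0..127 in units of -dBov (so 30 means -30 dBov).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RTP_FIXED_HEADER_LEN 12
#define RTP_EXT_BLOCK_LEN    8   /* 4-byte extension header + one 32-bit word */

/*
 * Hypothetical helper: returns the new packet length, or 0 if the input is
 * not a suitable RTP v2 packet or the output buffer is too small.
 */
static size_t rtp_insert_audio_level(const uint8_t *in, size_t in_len,
                                     uint8_t *out, size_t out_cap,
                                     uint8_t ext_id, uint8_t level_dbov,
                                     bool voice)
{
    if (in == NULL || out == NULL || in_len < RTP_FIXED_HEADER_LEN) {
        return 0;
    }
    if ((in[0] >> 6) != 2 || (in[0] & 0x10) != 0) {
        return 0;  /* not RTP v2, or an extension is already present */
    }
    if (ext_id < 1 || ext_id > 14 || level_dbov > 127) {
        return 0;
    }
    /* Fixed header plus 4 bytes per CSRC (count is the low nibble). */
    size_t hdr_len = RTP_FIXED_HEADER_LEN + 4u * (size_t)(in[0] & 0x0F);
    if (in_len < hdr_len || out_cap < in_len + RTP_EXT_BLOCK_LEN) {
        return 0;
    }

    memcpy(out, in, hdr_len);
    out[0] |= 0x10;                    /* set the X (extension) bit */

    uint8_t *ext = out + hdr_len;
    ext[0] = 0xBE;                     /* one-byte header profile 0xBEDE */
    ext[1] = 0xDE;
    ext[2] = 0x00;                     /* length in 32-bit words: 1 */
    ext[3] = 0x01;
    ext[4] = (uint8_t)(ext_id << 4);   /* ID in high nibble, (len - 1) = 0 */
    ext[5] = (uint8_t)((voice ? 0x80 : 0x00) | level_dbov);  /* V bit + level */
    ext[6] = 0x00;                     /* padding to the word boundary */
    ext[7] = 0x00;

    /* Payload follows the extension block unchanged. */
    memcpy(ext + RTP_EXT_BLOCK_LEN, in + hdr_len, in_len - hdr_len);
    return in_len + RTP_EXT_BLOCK_LEN;
}
```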

286-310: LGTM!

The patched SDP path correctly frees the allocated string after forwarding, and the fallback to the original SDP on injection failure is sound.

486-501: LGTM!

The transformer registration is correctly gated on publisher role with audio enabled, and gracefully degrades if the API call fails without blocking peer creation.


components/livekit/core/peer.c

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@components/livekit/core/peer.c`:
- Around line 139-171: In audio_level_transform, add a defensive NULL check for
frame->encoded_data (enc) before any writes/reads and return a non-zero error
(or propagate frame failure) if it's NULL; ensure you guard the subsequent
memcpy/enc writes and avoid modifying enc[0] etc. If available, log or set an
error code on the frame to aid debugging; keep the check at the top of
audio_level_transform immediately after assigning enc.

🧹 Nitpick comments (3)

components/livekit/core/engine.c (1)

710-716: Verify: subscribe_tracks errors are silently ignored here.

The return value of subscribe_tracks() is now discarded. This is fine for ENGINE_ERR_MAX_SUB (tracks from subsequent participants just log a warning), but ENGINE_ERR_INVALID_ARG would also be silently swallowed. In practice, tracks comes from protobuf data, where tracks_count > 0 should come with a valid pointer, so this is low risk, but worth being aware of.

components/livekit/core/peer.c (2)

51-104: SDP attribute search doesn't enforce line-boundary matching.

strstr(m_audio, "a=sendrecv") could match a substring within another attribute value (e.g., inside an a=fmtp: line). In practice SDP attributes are well-formed so this is very unlikely, but using a line-boundary-aware search (e.g., searching for "\r\na=sendrecv" or "\na=sendrecv") would be more robust.

Also, if the audio section uses a=inactive, the function returns NULL and falls back to the original SDP, which is reasonable.
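
A line-boundary-aware search of the kind suggested here might look like the sketch below. `sdp_find_line` is a hypothetical helper, not a function from the PR; it assumes `start` points at the beginning of a line (for example the `m=audio` line) and only compares the attribute against the start of each subsequent line.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return a pointer to the first line in [start, end)
 * that begins with `attr` (e.g. "a=sendrecv"), or NULL if there is none.
 * Pass end == NULL to search up to the end of the string.
 */
static const char *sdp_find_line(const char *start, const char *end,
                                 const char *attr)
{
    size_t attr_len = strlen(attr);
    const char *line = start;

    while (line != NULL && *line != '\0' && (end == NULL || line < end)) {
        if (strncmp(line, attr, attr_len) == 0) {
            return line;
        }
        const char *eol = strchr(line, '\n');
        line = (eol != NULL) ? eol + 1 : NULL;
    }
    return NULL;
}
```

Passing the start of the next `m=` line as `end` keeps the search scoped to the audio section, in the same spirit as the boundary handling noted in the earlier review of this function.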

290-310: Full SDP logged at INFO level may be noisy in production.

Lines 297 and 302 log the entire SDP string with ESP_LOGI. On embedded targets with limited serial bandwidth, consider using ESP_LOGD (debug level) to match the pattern used elsewhere in this file for verbose output.
♻️ Suggested change
-                    ESP_LOGI(TAG(peer), "Generated offer (with audio-level extmap):\n%s", patched_sdp);
+                    ESP_LOGD(TAG(peer), "Generated offer (with audio-level extmap):\n%s", patched_sdp);
-                    ESP_LOGI(TAG(peer), "Generated offer:\n%s", sdp);
+                    ESP_LOGD(TAG(peer), "Generated offer:\n%s", sdp);

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cccb4d7 and 402bab7.

📒 Files selected for processing (3)

components/livekit/core/engine.c
components/livekit/core/peer.c
components/livekit/idf_component.yml

🚧 Files skipped from review as they are similar to previous changes (1)

components/livekit/idf_component.yml

🧰 Additional context used

📓 Path-based instructions (1)

components/**

📄 CodeRabbit inference engine (AGENTS.md)

Place LiveKit ESP32 SDK sources and supporting components in the components/ directory

Files:

components/livekit/core/peer.c
components/livekit/core/engine.c

🧬 Code graph analysis (1)

components/livekit/core/engine.c (1)

components/livekit/core/signaling.c (1)

signal_send_update_subscription (420-436)

🔇 Additional comments (3)

components/livekit/core/engine.c (2)

91-98: LGTM — clean fixed-size multi-subscription state.

The MAX_AUDIO_SUBSCRIPTIONS limit with a parallel count field is a straightforward approach for this embedded context. memset(&eng->session, 0, ...) in cleanup_previous_connection (Line 766) correctly resets both the array and the count on reconnect.

153-186: LGTM — deduplication and subscription logic is correct.

The dedup check before the capacity check is properly ordered (avoids counting an already-subscribed track against the limit). strlcpy safely handles SIDs that might exceed the buffer. One minor note: when ENGINE_ERR_MAX_SUB is returned at Line 177, any remaining tracks in the list (even if they'd be deduplicated/skipped) are not iterated, but since only audio tracks are relevant and the limit is legitimately reached, this is acceptable behavior.

components/livekit/core/peer.c (1)

486-501: LGTM — transformer registration is properly guarded.

Only registered for publisher role with a non-NONE audio codec. Failure is logged as a warning without blocking peer creation, which is the right trade-off for an optional feature.


components/livekit/core/peer.c

ladvoc · 2026-02-10T02:44:01Z

Hi @martinadamsdev, thanks for your contribution! Since this requires a minor version bump of esp_peer, I’ve opened a separate PR for that. Once that is merged, we can move forward with this PR.

… extension When multiple devices join the same LiveKit room, they cannot hear each other because subscribe_tracks() only subscribes to the first audio track and breaks. This removes the single-track limitation, allowing subscription to up to MAX_AUDIO_SUBSCRIPTIONS (4) audio tracks with deduplication. Additionally, injects RFC 6464 ssrc-audio-level RTP header extension into publisher SDP and outgoing RTP packets to support LiveKit Active Speaker detection. Requires esp_peer >= 1.3.0 for the RTP transformer API.

martinadamsdev · 2026-02-11T11:44:12Z

Hi @ladvoc, thanks for splitting out the esp_peer upgrade into #70!

Now that #70 is merged, I've rebased this PR onto the latest main and removed the idf_component.yml change (since it's already handled). The diff is now limited to engine.c and peer.c only.

Ready for your review when you get a chance!

martinadamsdev force-pushed the feat/multi-audio-track-subscription branch from cccb4d7 to 402bab7 (February 7, 2026 15:58)


martinadamsdev force-pushed the feat/multi-audio-track-subscription branch from 7809e0c to df19489 (February 11, 2026 11:38)

martinadamsdev force-pushed the feat/multi-audio-track-subscription branch from df19489 to ecb1557 (February 11, 2026 11:40)

ladvoc mentioned this pull request Feb 17, 2026

when multiple users are connected, device can not hear them #43

Open

martinadamsdev wants to merge 1 commit into livekit:main from martinadamsdev:feat/multi-audio-track-subscription
